Residual noise compensation for robust speech recognition in nonstationary noise
Abstract
We present a model-based noise compensation algorithm for robust speech recognition in nonstationary noisy environments. The effect of noise is split into a stationary part, compensated by parallel model combination, and a time-varying residual. The evolution of the residual noise parameters is represented by a set of state space models, which are updated by Kalman prediction and the sequential maximum likelihood algorithm. Predictions of the residual noise parameters from different mixtures are fused, and the fused noise parameters are used to modify the linearized likelihood score of each mixture. Noise compensation proceeds in parallel with recognition. Experimental results demonstrate that the proposed algorithm improves recognition performance in highly nonstationary environments, compared with parallel model combination alone.
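The abstract does not give the state space equations or the fusion rule, so the following is only a minimal sketch of the general idea under stated assumptions: a scalar residual noise parameter per mixture, a random-walk state model, and inverse-variance weighting to fuse per-mixture predictions. The names `Gaussian`, `kalman_predict`, `kalman_update`, and `fuse` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Gaussian:
    """Gaussian belief about a scalar residual noise parameter."""
    mean: float
    var: float


def kalman_predict(state: Gaussian, process_var: float) -> Gaussian:
    # Assumed random-walk model: the mean carries over unchanged,
    # and the uncertainty grows by the process noise variance.
    return Gaussian(state.mean, state.var + process_var)


def kalman_update(pred: Gaussian, obs: float, obs_var: float) -> Gaussian:
    # Standard scalar Kalman correction with a direct (identity) observation.
    gain = pred.var / (pred.var + obs_var)
    mean = pred.mean + gain * (obs - pred.mean)
    var = (1.0 - gain) * pred.var
    return Gaussian(mean, var)


def fuse(preds: Sequence[Gaussian]) -> Gaussian:
    # Assumed fusion rule: inverse-variance (precision) weighting,
    # so more confident mixtures contribute more to the fused estimate.
    precision = sum(1.0 / p.var for p in preds)
    mean = sum(p.mean / p.var for p in preds) / precision
    return Gaussian(mean, 1.0 / precision)


if __name__ == "__main__":
    # Two mixtures track the same residual parameter from different priors.
    mixtures = [Gaussian(0.0, 1.0), Gaussian(2.0, 4.0)]
    predicted = [kalman_predict(m, process_var=0.1) for m in mixtures]
    fused = fuse(predicted)
    print(fused)
```

In the paper the fused parameters then modify the linearized likelihood score of each mixture; that step depends on details the abstract does not state, so it is left out here.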
Related articles
Residual noise compensation by a sequential EM algorithm for robust speech recognition in nonstationary noise
We model noise as a stationary component plus a time-varying residual. The stationary part is estimated off-line and compensated using Log-Add noise compensation. The time-varying residual is estimated and compensated using a sequential EM algorithm. The residual noise compensation proceeds in parallel with the recognition process. Experimental results demonstrate that the proposed algorithm im...
Joint model and feature based compensation for robust speech recognition under non-stationary noise environments
This paper presents a novel compensation approach for non-stationary noise, implemented in both the model and feature spaces. Because non-stationary noise can be decomposed into a constant part and a residual part, our proposed scheme is performed in two steps: before recognition, an extended Jacobian adaptation (JA) is applied to adapt the speech models for the constan...
A new method for robust speech recognition based on missing data using a bidirectional neural network
The performance of speech recognition systems is greatly reduced when speech is corrupted by noise. One common approach to robust speech recognition is the missing-feature method. In this approach, the components of the time-frequency representation of the signal (spectrogram) that have a low signal-to-noise ratio (SNR) are tagged as missing and deleted, then replaced using the remaining components and statistical ...
Improving the performance of MFCC for Persian robust speech recognition
Mel frequency cepstral coefficients (MFCCs) are the most widely used features in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in Automatic Speech Recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing step which applies to t...
Robust Speech Recognition in a Car Using a Microphone Array
The performance of automatic speech recognition relies on a vast amount of training speech data, mostly recorded with little or no background noise. Performance degrades significantly in the presence of background noise, which increases the type mismatch between training and test environments. Speech enhancement techniques can reduce the amount of type mismatch by extracting reliable speech features fro...
Publication date: 2000